Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors and expands the unit test suite with a focus on LLM client testing. The main improvement is introducing abstract base test classes (TestLLMBase and TestJudgeLLMBase) that define common test patterns for all LLM implementations, along with helper functions and shared fixtures to reduce code duplication.
Changes:
- Introduces base test classes and helper functions for consistent LLM testing across all providers
- Refactors all LLM client tests to inherit from base classes
- Adds coverage validation tests to ensure complete test coverage for new implementations
- Consolidates
last_response_metadatainitialization inLLMInterfacebase class - Extracts
parse_judge_modelsandget_parser()functions from judge.py for better testability - Adds validation for
max_personasparameter and new test cases
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/llm_clients/test_base_llm.py | New abstract base test classes for LLM implementations |
| tests/unit/llm_clients/test_helpers.py | New helper functions for metadata assertions and mock verification |
| tests/unit/llm_clients/conftest.py | New shared fixtures for mock responses and conversation histories |
| tests/unit/llm_clients/test_coverage.py | New automated coverage validation tests |
| tests/unit/llm_clients/README.md | New comprehensive documentation for test architecture |
| tests/unit/llm_clients/test_*_llm.py | Refactored to inherit from base classes and use helpers |
| llm_clients/llm_interface.py | Moved last_response_metadata to base class |
| llm_clients/*_llm.py | Removed duplicate last_response_metadata declarations |
| llm_clients/azure_llm.py | Added role field to metadata |
| judge.py | Extracted get_parser() and uses parse_judge_models() |
| judge/utils.py | New parse_judge_models function |
| tests/unit/judge/test_judge_cli.py | Complete rewrite with comprehensive CLI tests |
| tests/unit/judge/test_*.py | Updated to use proper CLI arg parsing patterns |
| generate_conversations/utils.py | Added max_personas validation |
| tests/unit/conftest.py | New shared mock_system_message fixture |
Comments suppressed due to low confidence (1)
tests/unit/llm_clients/test_azure_llm.py:490
- This method requires 3 positional arguments, whereas overridden TestJudgeLLMBase.test_generate_structured_response_success requires 2.
async def test_generate_structured_response_success(
self, mock_azure_config, mock_azure_model
):
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
emily-vanark
left a comment
There was a problem hiding this comment.
My personal qualms about testing mocks aside, this increases our test coverage from 625 tests with 75.7% coverage to 718 tests with 79.4% converage, and they all pass, so... well done!
Description
This PR cleans, (hopefully) improves, restructures, and adds to the unit testing suite.
The biggest change to note is under tests/unit/llm_clients/test_base_llm.py where
TestLLMBaseandTestJudgeLLMBaseare added as abstract testing classes to ensure their subclasses are properly implemented and tested across similar situations.Example of creating a new llm client without any tests:
Issue
Resolves SAF-145